401 research outputs found

Statement of accounting principles

Author: Hatfield Henry Rand
Moore Wm.
Sanders Thomas H.
Publication venue: eGrove
Publication date: 01/01/1938

American Institute of Accountants

eGrove (Univ. of Mississippi)

Roto-Translation Covariant Convolutional Networks for Medical Image Analysis

Author: DC Cireşan
EJ Bekkers
I Arganda-Carreras
J Staal
M Veta
MW Lafarge
R Duits
WM Rand
Publication date: 01/01/2018

We propose a framework for rotation and translation covariant deep learning using SE(2) group convolutions. The group product of the special Euclidean motion group SE(2) describes how a concatenation of two roto-translations results in a net roto-translation. We encode this geometric structure into convolutional neural networks (CNNs) via SE(2) group convolutional layers, which fit into the standard 2D CNN framework and which handle rotated input samples generically without the need for data augmentation. We introduce three layers: a lifting layer which lifts a 2D (vector valued) image to an SE(2)-image, i.e., 3D (vector valued) data whose domain is SE(2); a group convolution layer from and to an SE(2)-image; and a projection layer from an SE(2)-image to a 2D image. The lifting and group convolution layers are SE(2) covariant (the output roto-translates with the input). The final projection layer, a maximum intensity projection over rotations, makes the full CNN rotation invariant. We show with three different problems in histopathology, retinal imaging, and electron microscopy that with the proposed group CNNs, state-of-the-art performance can be achieved, without the need for data augmentation by rotation and with increased performance compared to standard CNNs that do rely on augmentation.

Comment: 8 pages, 2 figures, 1 table, accepted at MICCAI 201
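
The group product mentioned in this abstract can be illustrated with a small sketch. It assumes an SE(2) element is stored as a tuple (x, y, theta), with theta in radians; the function names `se2_compose` and `se2_apply` are hypothetical and are not taken from the paper or its code.

```python
import math

def se2_compose(g, h):
    """Return the product g*h of two roto-translations (x, y, theta).

    Applying g*h to a point is the same as applying h first, then g.
    """
    gx, gy, gt = g
    hx, hy, ht = h
    c, s = math.cos(gt), math.sin(gt)
    # Rotate h's translation by g's angle, then add g's translation.
    x = gx + c * hx - s * hy
    y = gy + s * hx + c * hy
    # Angles add, wrapped to [0, 2*pi).
    theta = (gt + ht) % (2 * math.pi)
    return (x, y, theta)

def se2_apply(g, p):
    """Apply the roto-translation g to a 2D point p = (px, py)."""
    gx, gy, gt = g
    px, py = p
    c, s = math.cos(gt), math.sin(gt)
    return (gx + c * px - s * py, gy + s * px + c * py)
```

With this convention, `se2_apply(se2_compose(g, h), p)` agrees with `se2_apply(g, se2_apply(h, p))` up to floating-point error, which is the "net roto-translation" property the abstract describes.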

arXiv.org e-Print Archive

Crossref

Pure OAI Repository

SMART: Unique splitting-while-merging framework for gene clustering

Author: A Thalamuthu
AD Lanterman
AE Teschendorff
AK Jain
Asoke K. Nandi
B Abu-Jamous
B Fritzke
B Fritzke
CR Lin
CS Wallace
D Dembele
D Jiang
David J. Roberts
G Celeux
H Akaike
J Qin
J Rissanen
KY Yeung
L Hubert
L Mavridis
L Zhao
MAT Figueiredo
P Tamayo
PT Spellman
R Xu
R Xu
RJ Cho
Rui Fa
S Bandyopadhyay
S Monti
S Wu
Sergio Gómez
T Kohonen
T Pramila
TR Golub
WM Rand
YJ Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2014

Copyright @ 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to the existing self-splitting algorithms and traditional algorithms it was compared with.
Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms.

National Institute for Health Researc

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive

Recommended from our members

Genome-wide association study of primary open-angle glaucoma in continental and admixed African populations.

Primary open angle glaucoma (POAG) is a complex disease with a major genetic contribution. Its prevalence varies greatly among ethnic groups, and is up to five times more frequent in black African populations compared to Europeans. So far, worldwide efforts to elucidate the genetic complexity of POAG in African populations have been limited. We conducted a genome-wide association study in 1113 POAG cases and 1826 controls from Tanzanian, South African and African American study samples. Apart from confirming evidence of association at TXNRD2 (rs16984299; OR[T] 1.20; P = 0.003), we found that a genetic risk score combining the effects of the 15 previously reported POAG loci was significantly associated with POAG in our samples (OR 1.56; 95% CI 1.26-1.93; P = 4.79 × 10⁻⁵). By genome-wide association testing we identified a novel candidate locus, rs141186647, harboring EXOC4 (OR[A] 0.48; P = 3.75 × 10⁻⁸), a gene transcribing a component of the exocyst complex involved in vesicle transport. The low frequency and high degree of genetic heterogeneity at this region hampered validation of this finding in predominantly West-African replication sets. Our results suggest that established genetic risk factors play a role in African POAG; however, they do not explain the higher disease load. The high heterogeneity within Africans remains a challenge for identifying the genetic commonalities for POAG in this ethnicity, and demands studies of extremely large size.

eScholarship - University of California

Exploring the longitudinal dynamics of herd BVD antibody test results using model-based clustering

Author: A Komarek
A Reverter
A Reverter
C Genolini
C Heffernan
DC Koestler
F Brülisauer
GJ Gunn
JA Hartigan
JL Andrews
L Hubert
LG Fernandes
MC De Souto
N. Coffey
PD McNicholas
PD McNicholas
R. W. Humphry
S Vilcek
W Charoenlarp
WM Rand
X Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/08/2019

Crossref

Edinburgh Research Explorer

SRUC - Scotland's Rural College

Genetic weighted k-means algorithm for clustering large-scale gene expression data

Author: CR Reeves
Fang-Xiang Wu
FX Wu
G Rudolph
G Sherlock
H Spath
J Hartigan
K Krishna
KS Al-Sultan
KY Yeung
L Hubert
LO Hall
LY Tseng
MT Laub
P Franti
P Scheunders
RO Duda
S Chu
S Dudoit
S Theodoridis
U Maulik
V Estivill-Castro
WM Rand
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

On negative results when using sentiment analysis tools for software engineering research

Author: AI Rousinopoulos
AJ Viera
Alexander Serebrenik
B Pang
B Vasilescu
C Giraud-Carrier
DW Zimmerman
E Brunner
F Konietschke
F Wilcoxon
FJ Shull
G Destefanis
J Cohen
JL Fleiss
KR Gabriel
L Hubert
M Hall
M Thelwall
M Thelwall
OJ Dunn
P Pritchard
P Tonella
Proshanta Sarkar
Robbert Jongeling
Subhajit Datta
WM Rand
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Recent years have seen increasing attention to social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their results might not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact of the choice of a sentiment analysis tool on software engineering studies by conducting a simple study of differences in issue resolution times for positive, negative and neutral texts. We repeat the study for seven datasets (issue trackers and Stack Overflow questions) and different sentiment analysis tools and observe that the disagreement between the tools can lead to diverging conclusions. Finally, we perform two replications of previously published studies and observe that the results of those studies cannot be confirmed when a different sentiment analysis tool is used.

Repository TU/e

Crossref

Springer - Publisher Connector

Pure OAI Repository

Institutional Knowledge at Singapore Management University

Ranked Adjusted Rand: integrating distance and partition information in a measure of clustering agreement

Author: A Thalamuthu
B Larsen
C Silva-Costa
C Silva-Costa
D Steinley
DL Wallace
EB Fowlkes
FJ Rohlf
Francisco R Pinto
FX Wu
GW Milligan
H Chipman
H Li
HL Kundel
I Serrano
JA Carrico
JA Carrico
Jonas S Almeida
João A Carriço
L Hubert
M Meila
Mário Ramirez
PH Sneath
S van Dongen
WM Rand
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Biological information is commonly used to cluster or classify entities of interest such as genes, conditions, species or samples. However, different sources of data can be used to classify the same set of entities, and methods that compare the performance of two data sources, or that determine how well a given classification agrees with another, are frequently needed, especially in the absence of a universally accepted "gold standard" classification.
RESULTS: Here, we describe a novel measure – the Ranked Adjusted Rand (RAR) index. RAR differs from existing methods by evaluating the extent of agreement between any two groupings, taking into account the intercluster distances. This characteristic is relevant to evaluate cases of pairs of entities grouped in the same cluster by one method and separated by another. The latter method may assign them to close neighbour clusters or, on the contrary, to clusters that are far apart from each other. RAR is applicable even when intercluster distance information is absent for both or one of the groupings. In the first case, RAR is equal to its predecessor, the Adjusted Rand (HA) index. Artificially designed clusterings were used to demonstrate situations in which only RAR was able to detect differences in the grouping patterns. A study with larger simulated clusterings ensured that in realistic conditions, RAR is effectively integrating distance and partition information. The new method was applied to biological examples to compare 1) two microbial typing methods, 2) two gene regulatory network distances and 3) microarray gene expression data with pathway information. In the first application, one of the methods does not provide intercluster distances while the other produced a hierarchical clustering. RAR proved to be more sensitive than HA in the choice of a threshold for defining clusters in the hierarchical method that maximizes agreement between the results of both methods.
CONCLUSION: RAR has its major advantage in combining cluster distance and partition information, while the previously available methods used only the latter. RAR should be used in the research problems where HA was previously used, because in the absence of intercluster distance effects it is an equally effective measure, and in the presence of distance effects it is a more complete one.
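
For orientation, the partition-only baseline named in this abstract, the Adjusted Rand (HA) index, can be sketched as follows. The sketch uses the standard pair-counting formulation of the Rand index and its Hubert-Arabie chance correction; it does not implement RAR itself, whose distance weighting the abstract does not specify. The function name `rand_indices` is hypothetical.

```python
from collections import Counter
from math import comb

def rand_indices(labels_a, labels_b):
    """Return (Rand index, adjusted Rand index) for two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")
    pairs = comb(n, 2)

    # Pairs placed together in both partitions (contingency-table cells),
    # together in partition A, and together in partition B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Rand index: fraction of pairs on which the two partitions agree.
    rand = (pairs + 2 * together_both - together_a - together_b) / pairs

    # Adjusted Rand: subtract the agreement expected by chance.
    expected = together_a * together_b / pairs
    max_index = (together_a + together_b) / 2
    denom = max_index - expected
    adjusted = 1.0 if denom == 0 else (together_both - expected) / denom
    return rand, adjusted
```

Two labelings that induce the same grouping, such as `[0, 0, 1, 1]` and `[1, 1, 0, 0]`, score 1.0 on both indices regardless of how the clusters are named.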

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Improving cluster recovery with feature rescaling factors

Author: A Hatamlou
AK Jain
C Hennig
D Aloise
D Steinley
D Steinley
D Xu
E Lord
H-P Kriegel
H-P Kriegel
M Erisoglu
MCP de Souto
MM-T Chiang
R Panda
R Suzuki
R Ünlü
RC de Amorim
RC de Amorim
RC de Amorim
RC de Amorim
RL Melvin
WM Rand
X Li
Y Sun
Z Deng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2021

The data preprocessing stage is crucial in clustering. Features may describe entities using different scales. To rectify this, one usually applies feature normalisation aiming at rescaling features so that none of them overpowers the others in the objective function of the selected clustering algorithm. In this paper, we argue that the rescaling procedure should not treat all features identically. Instead, it should favour the features that are more meaningful for clustering. With this in mind, we introduce a feature rescaling method that takes into account the within-cluster degree of relevance of each feature. Our comprehensive simulation study, carried out on real and synthetic data, with and without noise features, demonstrates that clustering methods that use the proposed data normalisation strategy clearly outperform those that use traditional data normalisation.
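
As a point of reference for the "traditional" normalisation this abstract contrasts itself with, the sketch below centres each feature on its mean and divides by its range. This is one common choice and an assumption here; the abstract does not say which traditional scheme the authors used, and the sketch does not implement their proposed rescaling factors. The function name `range_normalise` is hypothetical.

```python
def range_normalise(rows):
    """Centre each feature (column) on its mean and divide by its range.

    rows: list of equal-length lists of numbers, one list per entity.
    Constant features (range 0) are mapped to 0.0.
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        mean = sum(col) / len(col)
        span = max(col) - min(col)
        scaled_columns.append([(v - mean) / span if span else 0.0 for v in col])
    # Transpose back to one list per entity.
    return [list(row) for row in zip(*scaled_columns)]
```

After this step every non-constant feature spans a range of exactly 1, so a feature measured in hundreds no longer dominates a distance computed alongside a feature measured in units.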

University of Essex Research Repository

Crossref

An adaptive version of k-medoids to deal with the uncertainty in clustering heterogeneous data using an intermediary fusion approach

Author: A Oliva
A Strehl
Aalaa Mojahed
B Khaleghi
Beatriz de la Iglesia
BV Dasarathy
D Hall
DJ Berndt
E Acar
G Salton
GRG Lanckriet
GRG Lanckriet
H-S Park
L Kaufman
L Kaufman
LR Dice
M Žitnik
MA Abidi
MH Vliet van
N-EE Faouzi
OA Akeem
P Pavlidis
RA Baeza-Yates
S Jaccard
TN Manjunath
TY Chan
WM Rand
Y Shi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

This paper introduces Hk-medoids, a modified version of the standard k-medoids algorithm. The modification extends the algorithm for the problem of clustering complex heterogeneous objects that are described by a diversity of data types, e.g. text, images, structured data and time series. We first proposed an intermediary fusion approach to calculate fused similarities between objects, SMF, taking into account the similarities between the component elements of the objects using appropriate similarity measures. The fused approach entails uncertainty for incomplete objects or for objects which have diverging distances according to the different components. Our implementation of Hk-medoids proposed here works with the fused distances and deals with the uncertainty in the fusion process. We experimentally evaluate the potential of our proposed algorithm using five datasets with different combinations of data types that define the objects. Our results show the feasibility of our algorithm, and also show a performance enhancement compared with the application of the original SMF approach in combination with a standard k-medoids that does not take uncertainty into account. In addition, from a theoretical point of view, our proposed algorithm has lower computational complexity than the popular PAM implementation.
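
To make the baseline concrete, here is a minimal sketch of a standard k-medoids loop operating on a precomputed distance matrix, as a fused distance matrix might be supplied. It uses the simple alternating scheme (assign, then update medoids), which is neither PAM nor the Hk-medoids variant the abstract proposes, and it ignores uncertainty entirely. The function name `k_medoids` and its parameters are hypothetical.

```python
def k_medoids(dist, init_medoids, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    dist: square list of lists, dist[i][j] = distance between objects i and j.
    init_medoids: list of k distinct object indices used as starting medoids.
    Returns (medoids, labels), where labels[i] is the cluster index of object i.
    """
    n = len(dist)
    medoids = list(init_medoids)
    k = len(medoids)
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each object joins the cluster of its nearest medoid.
        labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
        # Update step: the new medoid minimises total distance to its cluster members.
        new_medoids = []
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if not members:
                # Keep the old medoid if its cluster became empty.
                new_medoids.append(medoids[j])
                continue
            new_medoids.append(min(members, key=lambda c: sum(dist[c][m] for m in members)))
        if new_medoids == medoids:
            break  # converged: labels already match these medoids
        medoids = new_medoids
    return medoids, labels
```

Because the loop only reads `dist`, any pairwise distance, including one fused from several data types, can be plugged in without changing the code.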

Crossref

University of East Anglia digital repository